Overview

Dataset statistics

Number of variables: 22
Number of observations: 914
Missing cells: 1281
Missing cells (%): 6.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 203.7 KiB
Average record size in memory: 228.2 B
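
The percentage and per-record figures above follow from the raw counts. The sketch below redoes that arithmetic under two assumptions that the report does not state explicitly: that "Missing cells (%)" is the number of missing cells divided by variables × observations, and that the average record size is the total memory divided by the number of observations. The variable names are illustrative only.

```python
# Figures copied from the dataset statistics above.
n_variables = 22
n_observations = 914
missing_cells = 1281
total_memory_kib = 203.7

# Assumed definition: share of missing cells among all cells.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumed definition: memory per row, converting KiB to bytes.
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values shown in the report, which suggests the assumed definitions are the ones used.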

Variable types

Categorical: 14
Numeric: 8

Alerts

drive_train has constant value "fwd" (Constant)
name has a high cardinality: 216 distinct values (High cardinality)
model has a high cardinality: 87 distinct values (High cardinality)
engine_type has a high cardinality: 91 distinct values (High cardinality)
fuel_economy has a high cardinality: 158 distinct values (High cardinality)
year is highly overall correlated with mileage and 1 other field (High correlation)
mileage is highly overall correlated with year (High correlation)
fuel_capacity is highly overall correlated with cc_displacement and 8 other fields (High correlation)
cc_displacement is highly overall correlated with fuel_capacity and 6 other fields (High correlation)
bhp is highly overall correlated with fuel_capacity and 6 other fields (High correlation)
torque is highly overall correlated with fuel_capacity and 7 other fields (High correlation)
price is highly overall correlated with year and 7 other fields (High correlation)
engine_litres is highly overall correlated with fuel_capacity and 8 other fields (High correlation)
make is highly overall correlated with fuel_capacity and 3 other fields (High correlation)
model is highly overall correlated with fuel_capacity and 11 other fields (High correlation)
body_style is highly overall correlated with model and 1 other field (High correlation)
seating_capacity is highly overall correlated with fuel_capacity and 2 other fields (High correlation)
fuel_type is highly overall correlated with torque and 2 other fields (High correlation)
engine_type is highly overall correlated with fuel_capacity and 12 other fields (High correlation)
transmission_gears is highly overall correlated with engine_type (High correlation)
transmission_type is highly overall correlated with model (High correlation)
emission_class is highly overall correlated with model and 1 other field (High correlation)
num_owners is highly imbalanced (58.5%) (Imbalance)
seating_capacity is highly imbalanced (85.9%) (Imbalance)
fuel_type is highly imbalanced (55.9%) (Imbalance)
transmission_gears is highly imbalanced (55.7%) (Imbalance)
engine_litres has 393 (43.0%) missing values (Missing)
drive_train has 888 (97.2%) missing values (Missing)
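
The imbalance percentages in the alerts are consistent with one minus the normalized Shannon entropy of each variable's value counts. That reading is an assumption checked against the counts listed later in this report, not a statement about the tool's internals. A minimal sketch, with a hypothetical helper name `imbalance_score`:

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class proportions and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption for this sketch: a single class scores 0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Value counts copied from the "Common Values" tables of each variable.
value_counts = {
    "num_owners": [770, 140, 4],
    "seating_capacity": [868, 39, 5, 1, 1],
    "fuel_type": [749, 163, 2],
    "transmission_gears": [707, 153, 35, 16, 3],
}
for name, counts in value_counts.items():
    print(f"{name}: {100 * imbalance_score(counts):.1f}%")
```

A score near 0% means the classes are evenly spread; a score near 100% means almost every row falls in one class, as with seating_capacity.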

Reproduction

Analysis started: 2023-07-17 02:38:52.601756
Analysis finished: 2023-07-17 02:39:23.498026
Duration: 30.9 seconds
Software version: pandas-profiling v3.6.6
Download configuration: config.json
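
The reported duration can be checked against the two timestamps. This is only a sketch of that subtraction using the standard library; the format string is an assumption based on how the timestamps are printed here.

```python
from datetime import datetime

# Assumed timestamp layout, matching the values printed in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-07-17 02:38:52.601756", FMT)
finished = datetime.strptime("2023-07-17 02:39:23.498026", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.1f} seconds")
```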

Variables

name
Categorical

Distinct: 216
Distinct (%): 23.6%
Missing: 0
Missing (%): 0.0%
Memory size: 14.3 KiB
vxi 74
sportz 51
sx 30
asta 28
alpha 27
Other values (211) 704

Length

Max length: 26
Median length: 21
Mean length: 6.7943107
Min length: 0

Characters and Unicode

Total characters: 6210
Distinct characters: 41
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 90
Unique (%): 9.8%
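
Several of the figures for this variable are ratios over the 914 observations. The sketch below assumes that "Distinct (%)" and "Unique (%)" divide by the number of rows, that "Unique" counts values occurring exactly once, and that the mean length is total characters divided by rows. These definitions are inferred from the numbers, not taken from the report.

```python
n_rows = 914        # number of observations in the dataset
distinct = 216      # distinct values of "name"
unique = 90         # assumed: values that occur exactly once
total_chars = 6210  # total characters across all rows

distinct_pct = 100 * distinct / n_rows
unique_pct = 100 * unique / n_rows
mean_length = total_chars / n_rows

print(f"distinct: {distinct_pct:.1f}%")
print(f"unique: {unique_pct:.1f}%")
print(f"mean length: {mean_length:.7f}")
```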

Sample

1st row: highline at (d)
2nd row: sx
3rd row: vx
4th row: rxt amt
5th row: asta

Common Values

ValueCountFrequency (%)
vxi 74
 
8.1%
sportz 51
 
5.6%
sx 30
 
3.3%
asta 28
 
3.1%
alpha 27
 
3.0%
vx 25
 
2.7%
titanium 23
 
2.5%
magna 23
 
2.5%
vxi amt 19
 
2.1%
v 17
 
1.9%
Other values (206) 597
65.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
vxi 121
 
7.7%
sx 97
 
6.2%
plus 75
 
4.8%
at 74
 
4.7%
sportz 70
 
4.5%
amt 67
 
4.3%
o 61
 
3.9%
asta 58
 
3.7%
zxi 49
 
3.1%
vx 43
 
2.7%
Other values (112) 851
54.3%

Most occurring characters

ValueCountFrequency (%)
t 665
 
10.7%
654
 
10.5%
a 575
 
9.3%
x 442
 
7.1%
i 439
 
7.1%
s 425
 
6.8%
v 291
 
4.7%
p 283
 
4.6%
l 245
 
3.9%
r 244
 
3.9%
Other values (31) 1947
31.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5181
83.4%
Space Separator 654
 
10.5%
Decimal Number 126
 
2.0%
Open Punctuation 90
 
1.4%
Close Punctuation 90
 
1.4%
Math Symbol 47
 
0.8%
Other Punctuation 16
 
0.3%
Dash Punctuation 6
 
0.1%
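
The category names in this table are Unicode general categories. As a rough illustration, the sketch below classifies the characters of the five sample values shown for this variable using Python's standard `unicodedata` module. The `LONG_NAMES` mapping is a hand-written assumption that covers only the categories appearing in this report.

```python
import unicodedata
from collections import Counter

# The five sample values shown for "name" in this report.
samples = ["highline at (d)", "sx", "vx", "rxt amt", "asta"]

# Two-letter general category codes mapped to the long names used in the report.
LONG_NAMES = {
    "Ll": "Lowercase Letter", "Zs": "Space Separator", "Nd": "Decimal Number",
    "Ps": "Open Punctuation", "Pe": "Close Punctuation", "Sm": "Math Symbol",
    "Po": "Other Punctuation", "Pd": "Dash Punctuation",
}

# Count every character of every sample by its general category.
category_counts = Counter(
    LONG_NAMES.get(unicodedata.category(ch), unicodedata.category(ch))
    for value in samples
    for ch in value
)
for name, count in category_counts.most_common():
    print(name, count)
```

Run over the full column instead of five samples, the same counting would be expected to give a table of the shape shown above.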

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 665
12.8%
a 575
11.1%
x 442
 
8.5%
i 439
 
8.5%
s 425
 
8.2%
v 291
 
5.6%
p 283
 
5.5%
l 245
 
4.7%
r 244
 
4.7%
o 226
 
4.4%
Other values (15) 1346
26.0%
Decimal Number
ValueCountFrequency (%)
1 38
30.2%
2 20
15.9%
0 18
14.3%
5 17
13.5%
4 16
12.7%
8 14
 
11.1%
6 2
 
1.6%
9 1
 
0.8%
Other Punctuation
ValueCountFrequency (%)
. 11
68.8%
/ 4
 
25.0%
& 1
 
6.2%
Space Separator
ValueCountFrequency (%)
654
100.0%
Open Punctuation
ValueCountFrequency (%)
( 90
100.0%
Close Punctuation
ValueCountFrequency (%)
) 90
100.0%
Math Symbol
ValueCountFrequency (%)
+ 47
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5181
83.4%
Common 1029
 
16.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 665
12.8%
a 575
11.1%
x 442
 
8.5%
i 439
 
8.5%
s 425
 
8.2%
v 291
 
5.6%
p 283
 
5.5%
l 245
 
4.7%
r 244
 
4.7%
o 226
 
4.4%
Other values (15) 1346
26.0%
Common
ValueCountFrequency (%)
654
63.6%
( 90
 
8.7%
) 90
 
8.7%
+ 47
 
4.6%
1 38
 
3.7%
2 20
 
1.9%
0 18
 
1.7%
5 17
 
1.7%
4 16
 
1.6%
8 14
 
1.4%
Other values (6) 25
 
2.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6210
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 665
 
10.7%
654
 
10.5%
a 575
 
9.3%
x 442
 
7.1%
i 439
 
7.1%
s 425
 
6.8%
v 291
 
4.7%
p 283
 
4.6%
l 245
 
3.9%
r 244
 
3.9%
Other values (31) 1947
31.4%

make
Categorical

Distinct: 16
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 14.3 KiB
hyundai 293
maruti suzuki 285
honda 73
renault 66
ford 46
Other values (11) 151

Length

Max length: 13
Median length: 10
Mean length: 8.5568928
Min length: 3

Characters and Unicode

Total characters: 7821
Distinct characters: 24
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: volkswagen
2nd row: hyundai
3rd row: honda
4th row: renault
5th row: hyundai

Common Values

ValueCountFrequency (%)
hyundai 293
32.1%
maruti suzuki 285
31.2%
honda 73
 
8.0%
renault 66
 
7.2%
ford 46
 
5.0%
toyota 30
 
3.3%
volkswagen 27
 
3.0%
tata 23
 
2.5%
mg motors 23
 
2.5%
mahindra 18
 
2.0%
Other values (6) 30
 
3.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
hyundai 293
24.0%
maruti 285
23.3%
suzuki 285
23.3%
honda 73
 
6.0%
renault 66
 
5.4%
ford 46
 
3.8%
toyota 30
 
2.5%
volkswagen 27
 
2.2%
motors 23
 
1.9%
mg 23
 
1.9%
Other values (8) 71
 
5.8%
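
The table above appears to count individual words rather than whole values: "maruti suzuki" contributes 285 to both "maruti" and "suzuki". The sketch below reproduces those counts from the top ten values in the Common Values table, assuming words are split on whitespace. Values hidden behind "Other values" are not included, so totals for rarer words are not recoverable here.

```python
from collections import Counter

# Top ten "make" values and their row counts, from the Common Values table.
value_counts = {
    "hyundai": 293, "maruti suzuki": 285, "honda": 73, "renault": 66,
    "ford": 46, "toyota": 30, "volkswagen": 27, "tata": 23,
    "mg motors": 23, "mahindra": 18,
}

word_counts = Counter()
for value, count in value_counts.items():
    for word in value.split():  # assumption: words are split on whitespace
        word_counts[word] += count

for word, count in word_counts.most_common(5):
    print(word, count)
```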

Most occurring characters

ValueCountFrequency (%)
u 1217
15.6%
i 893
11.4%
a 876
11.2%
t 489
 
6.3%
n 488
 
6.2%
r 444
 
5.7%
d 438
 
5.6%
h 390
 
5.0%
s 351
 
4.5%
m 349
 
4.5%
Other values (14) 1886
24.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7513
96.1%
Space Separator 308
 
3.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 1217
16.2%
i 893
11.9%
a 876
11.7%
t 489
 
6.5%
n 488
 
6.5%
r 444
 
5.9%
d 438
 
5.8%
h 390
 
5.2%
s 351
 
4.7%
m 349
 
4.6%
Other values (13) 1578
21.0%
Space Separator
ValueCountFrequency (%)
308
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7513
96.1%
Common 308
 
3.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 1217
16.2%
i 893
11.9%
a 876
11.7%
t 489
 
6.5%
n 488
 
6.5%
r 444
 
5.9%
d 438
 
5.8%
h 390
 
5.2%
s 351
 
4.7%
m 349
 
4.6%
Other values (13) 1578
21.0%
Common
ValueCountFrequency (%)
308
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7821
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
u 1217
15.6%
i 893
11.4%
a 876
11.2%
t 489
 
6.3%
n 488
 
6.2%
r 444
 
5.7%
d 438
 
5.6%
h 390
 
5.0%
s 351
 
4.5%
m 349
 
4.5%
Other values (14) 1886
24.1%

model
Categorical

HIGH CARDINALITY  HIGH CORRELATION 

Distinct: 87
Distinct (%): 9.5%
Missing: 0
Missing (%): 0.0%
Memory size: 14.3 KiB
elite i20 64
i10 36
grand i10 35
baleno 32
verna 32
Other values (82) 715

Length

Max length: 16
Median length: 14
Mean length: 6.3041575
Min length: 2

Characters and Unicode

Total characters: 5762
Distinct characters: 35
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 15
Unique (%): 1.6%

Sample

1st row: ameo
2nd row: i20 active
3rd row: wr-v
4th row: kwid
5th row: grand i10

Common Values

ValueCountFrequency (%)
elite i20 64
 
7.0%
i10 36
 
3.9%
grand i10 35
 
3.8%
baleno 32
 
3.5%
verna 32
 
3.5%
ciaz 30
 
3.3%
kwid 28
 
3.1%
city 28
 
3.1%
ecosport 27
 
3.0%
alto k10 27
 
3.0%
Other values (77) 575
62.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
i20 93
 
7.8%
i10 76
 
6.4%
elite 64
 
5.4%
alto 53
 
4.5%
grand 40
 
3.4%
verna 38
 
3.2%
baleno 32
 
2.7%
swift 32
 
2.7%
dzire 31
 
2.6%
ciaz 30
 
2.5%
Other values (83) 698
58.8%

Most occurring characters

ValueCountFrequency (%)
e 622
 
10.8%
i 596
 
10.3%
r 433
 
7.5%
a 418
 
7.3%
t 414
 
7.2%
o 390
 
6.8%
0 276
 
4.8%
273
 
4.7%
c 243
 
4.2%
n 236
 
4.1%
Other values (25) 1861
32.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4895
85.0%
Decimal Number 538
 
9.3%
Space Separator 273
 
4.7%
Dash Punctuation 35
 
0.6%
Other Punctuation 21
 
0.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 622
12.7%
i 596
12.2%
r 433
8.8%
a 418
 
8.5%
t 414
 
8.5%
o 390
 
8.0%
c 243
 
5.0%
n 236
 
4.8%
s 235
 
4.8%
l 234
 
4.8%
Other values (15) 1074
21.9%
Decimal Number
ValueCountFrequency (%)
0 276
51.3%
1 130
24.2%
2 94
 
17.5%
8 25
 
4.6%
5 5
 
0.9%
4 5
 
0.9%
3 3
 
0.6%
Space Separator
ValueCountFrequency (%)
273
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 35
100.0%
Other Punctuation
ValueCountFrequency (%)
. 21
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4895
85.0%
Common 867
 
15.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 622
12.7%
i 596
12.2%
r 433
8.8%
a 418
 
8.5%
t 414
 
8.5%
o 390
 
8.0%
c 243
 
5.0%
n 236
 
4.8%
s 235
 
4.8%
l 234
 
4.8%
Other values (15) 1074
21.9%
Common
ValueCountFrequency (%)
0 276
31.8%
273
31.5%
1 130
15.0%
2 94
 
10.8%
- 35
 
4.0%
8 25
 
2.9%
. 21
 
2.4%
5 5
 
0.6%
4 5
 
0.6%
3 3
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5762
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 622
 
10.8%
i 596
 
10.3%
r 433
 
7.5%
a 418
 
7.3%
t 414
 
7.2%
o 390
 
6.8%
0 276
 
4.8%
273
 
4.7%
c 243
 
4.2%
n 236
 
4.1%
Other values (25) 1861
32.3%

year
Real number (ℝ)

Distinct: 12
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2016.8906
Minimum: 2011
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.2 KiB

Quantile statistics

Minimum: 2011
5-th percentile: 2012
Q1: 2015
median: 2017
Q3: 2019
95-th percentile: 2021
Maximum: 2022
Range: 11
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.816214
Coefficient of variation (CV): 0.0013963147
Kurtosis: -0.73224576
Mean: 2016.8906
Median Absolute Deviation (MAD): 2
Skewness: -0.35188597
Sum: 1843438
Variance: 7.9310614
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
2017 129
14.1%
2020 113
12.4%
2019 112
12.3%
2018 110
12.0%
2016 88
9.6%
2015 83
9.1%
2014 71
7.8%
2021 65
7.1%
2013 51
 
5.6%
2011 42
 
4.6%
Other values (2) 50
 
5.5%
ValueCountFrequency (%)
2011 42
 
4.6%
2012 36
 
3.9%
2013 51
 
5.6%
2014 71
7.8%
2015 83
9.1%
2016 88
9.6%
2017 129
14.1%
2018 110
12.0%
2019 112
12.3%
2020 113
12.4%
ValueCountFrequency (%)
2022 14
 
1.5%
2021 65
7.1%
2020 113
12.4%
2019 112
12.3%
2018 110
12.0%
2017 129
14.1%
2016 88
9.6%
2015 83
9.1%
2014 71
7.8%
2013 51
 
5.6%
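
Because year has only 12 distinct values and the frequency tables above list all of them, the summary statistics can be recomputed from the counts alone. The sketch below does so, assuming the reported variance is the sample variance (divisor n − 1) and that the coefficient of variation is standard deviation divided by mean. Both assumptions happen to reproduce the reported figures.

```python
import math

# Year -> count, taken from the frequency tables above (all 12 distinct values).
year_counts = {
    2011: 42, 2012: 36, 2013: 51, 2014: 71, 2015: 83, 2016: 88,
    2017: 129, 2018: 110, 2019: 112, 2020: 113, 2021: 65, 2022: 14,
}

n = sum(year_counts.values())
total = sum(year * count for year, count in year_counts.items())
mean = total / n

# Sample variance (divisor n - 1); an assumption about the report's convention.
ss = sum(count * (year - mean) ** 2 for year, count in year_counts.items())
variance = ss / (n - 1)
std = math.sqrt(variance)
cv = std / mean
value_range = max(year_counts) - min(year_counts)

print(f"n={n} sum={total} mean={mean:.4f}")
print(f"variance={variance:.7f} std={std:.6f} cv={cv:.10f} range={value_range}")
```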

color
Categorical

Distinct: 15
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 14.3 KiB
white 188
red 167
grey 156
silver 126
blue 122
Other values (10) 155

Length

Max length: 6
Median length: 5
Mean length: 4.5196937
Min length: 3

Characters and Unicode

Total characters: 4131
Distinct characters: 22
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: silver
2nd row: red
3rd row: white
4th row: bronze
5th row: orange

Common Values

ValueCountFrequency (%)
white 188
20.6%
red 167
18.3%
grey 156
17.1%
silver 126
13.8%
blue 122
13.3%
brown 62
 
6.8%
black 35
 
3.8%
orange 19
 
2.1%
bronze 12
 
1.3%
beige 9
 
1.0%
Other values (5) 18
 
2.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
white 188
20.6%
red 167
18.3%
grey 156
17.1%
silver 126
13.8%
blue 122
13.3%
brown 62
 
6.8%
black 35
 
3.8%
orange 19
 
2.1%
bronze 12
 
1.3%
beige 9
 
1.0%
Other values (5) 18
 
2.0%

Most occurring characters

ValueCountFrequency (%)
e 825
20.0%
r 551
13.3%
i 323
 
7.8%
l 302
 
7.3%
w 256
 
6.2%
b 240
 
5.8%
g 189
 
4.6%
t 188
 
4.6%
h 188
 
4.6%
d 170
 
4.1%
Other values (12) 899
21.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4131
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 825
20.0%
r 551
13.3%
i 323
 
7.8%
l 302
 
7.3%
w 256
 
6.2%
b 240
 
5.8%
g 189
 
4.6%
t 188
 
4.6%
h 188
 
4.6%
d 170
 
4.1%
Other values (12) 899
21.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 4131
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 825
20.0%
r 551
13.3%
i 323
 
7.8%
l 302
 
7.3%
w 256
 
6.2%
b 240
 
5.8%
g 189
 
4.6%
t 188
 
4.6%
h 188
 
4.6%
d 170
 
4.1%
Other values (12) 899
21.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4131
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 825
20.0%
r 551
13.3%
i 323
 
7.8%
l 302
 
7.3%
w 256
 
6.2%
b 240
 
5.8%
g 189
 
4.6%
t 188
 
4.6%
h 188
 
4.6%
d 170
 
4.1%
Other values (12) 899
21.8%

body_style
Categorical

Distinct: 5
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 14.3 KiB
hatchback 462
sedan 207
suv 186
muv 38
crossover 21

Length

Max length: 9
Median length: 9
Mean length: 6.6236324
Min length: 3

Characters and Unicode

Total characters: 6054
Distinct characters: 15
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: sedan
2nd row: crossover
3rd row: suv
4th row: hatchback
5th row: hatchback

Common Values

ValueCountFrequency (%)
hatchback 462
50.5%
sedan 207
22.6%
suv 186
20.4%
muv 38
 
4.2%
crossover 21
 
2.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
hatchback 462
50.5%
sedan 207
22.6%
suv 186
20.4%
muv 38
 
4.2%
crossover 21
 
2.3%

Most occurring characters

ValueCountFrequency (%)
a 1131
18.7%
c 945
15.6%
h 924
15.3%
t 462
7.6%
b 462
7.6%
k 462
7.6%
s 435
 
7.2%
v 245
 
4.0%
e 228
 
3.8%
u 224
 
3.7%
Other values (5) 536
8.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6054
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1131
18.7%
c 945
15.6%
h 924
15.3%
t 462
7.6%
b 462
7.6%
k 462
7.6%
s 435
 
7.2%
v 245
 
4.0%
e 228
 
3.8%
u 224
 
3.7%
Other values (5) 536
8.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 6054
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1131
18.7%
c 945
15.6%
h 924
15.3%
t 462
7.6%
b 462
7.6%
k 462
7.6%
s 435
 
7.2%
v 245
 
4.0%
e 228
 
3.8%
u 224
 
3.7%
Other values (5) 536
8.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6054
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1131
18.7%
c 945
15.6%
h 924
15.3%
t 462
7.6%
b 462
7.6%
k 462
7.6%
s 435
 
7.2%
v 245
 
4.0%
e 228
 
3.8%
u 224
 
3.7%
Other values (5) 536
8.9%

mileage
Real number (ℝ)

Distinct: 878
Distinct (%): 96.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 41728.979
Minimum: 1117
Maximum: 99495
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.2 KiB

Quantile statistics

Minimum: 1117
5-th percentile: 7872.95
Q1: 22559
median: 38145
Q3: 58017.5
95-th percentile: 88735
Maximum: 99495
Range: 98378
Interquartile range (IQR): 35458.5

Descriptive statistics

Standard deviation: 24460.833
Coefficient of variation (CV): 0.58618336
Kurtosis: -0.67034938
Mean: 41728.979
Median Absolute Deviation (MAD): 17731.5
Skewness: 0.49571404
Sum: 38140287
Variance: 5.9833237 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
59680 2
 
0.2%
45378 2
 
0.2%
56221 2
 
0.2%
30234 2
 
0.2%
31262 2
 
0.2%
42200 2
 
0.2%
50009 2
 
0.2%
17246 2
 
0.2%
6738 2
 
0.2%
22538 2
 
0.2%
Other values (868) 894
97.8%
ValueCountFrequency (%)
1117 1
0.1%
1540 1
0.1%
2163 1
0.1%
2174 1
0.1%
3174 1
0.1%
3474 1
0.1%
3679 1
0.1%
3832 1
0.1%
4305 1
0.1%
4337 1
0.1%
ValueCountFrequency (%)
99495 2
0.2%
99144 1
0.1%
99001 1
0.1%
98109 1
0.1%
97902 1
0.1%
97573 1
0.1%
97039 2
0.2%
96939 1
0.1%
96471 1
0.1%
96236 1
0.1%

num_owners
Categorical

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 14.3 KiB
1 770
2 140
3 4

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 914
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 2
4th row: 1
5th row: 1

Common Values

ValueCountFrequency (%)
1 770
84.2%
2 140
 
15.3%
3 4
 
0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 770
84.2%
2 140
 
15.3%
3 4
 
0.4%

Most occurring characters

ValueCountFrequency (%)
1 770
84.2%
2 140
 
15.3%
3 4
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 914
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 770
84.2%
2 140
 
15.3%
3 4
 
0.4%

Most occurring scripts

ValueCountFrequency (%)
Common 914
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 770
84.2%
2 140
 
15.3%
3 4
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 914
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 770
84.2%
2 140
 
15.3%
3 4
 
0.4%

seating_capacity
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 14.3 KiB
5 868
7 39
8 5
6 1
4 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 914
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 0.2%

Sample

1st row: 5
2nd row: 5
3rd row: 5
4th row: 5
5th row: 5

Common Values

ValueCountFrequency (%)
5 868
95.0%
7 39
 
4.3%
8 5
 
0.5%
6 1
 
0.1%
4 1
 
0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
5 868
95.0%
7 39
 
4.3%
8 5
 
0.5%
6 1
 
0.1%
4 1
 
0.1%

Most occurring characters

ValueCountFrequency (%)
5 868
95.0%
7 39
 
4.3%
8 5
 
0.5%
6 1
 
0.1%
4 1
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 914
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 868
95.0%
7 39
 
4.3%
8 5
 
0.5%
6 1
 
0.1%
4 1
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 914
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 868
95.0%
7 39
 
4.3%
8 5
 
0.5%
6 1
 
0.1%
4 1
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 914
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 868
95.0%
7 39
 
4.3%
8 5
 
0.5%
6 1
 
0.1%
4 1
 
0.1%

fuel_type
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 14.3 KiB
p 749
d 163
pc 2

Length

Max length: 2
Median length: 1
Mean length: 1.0021882
Min length: 1

Characters and Unicode

Total characters: 916
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: d
2nd row: p
3rd row: p
4th row: p
5th row: p

Common Values

ValueCountFrequency (%)
p 749
81.9%
d 163
 
17.8%
pc 2
 
0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
p 749
81.9%
d 163
 
17.8%
pc 2
 
0.2%

Most occurring characters

ValueCountFrequency (%)
p 751
82.0%
d 163
 
17.8%
c 2
 
0.2%
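
The character counts differ slightly from the value counts because the two-character value "pc" contributes one "p" and one "c" per row. A minimal sketch of that bookkeeping, using only the three values listed under Common Values; the variable names are illustrative:

```python
from collections import Counter

# fuel_type values and their row counts, from the Common Values table.
value_counts = {"p": 749, "d": 163, "pc": 2}

# Each row contributes every character of its value once.
char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count

total_chars = sum(char_counts.values())
for ch, count in char_counts.most_common():
    print(ch, count, f"{100 * count / total_chars:.1f}%")
```

This accounts for the 916 total characters over 914 rows and for "p" appearing 751 times rather than 749.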

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 916
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
p 751
82.0%
d 163
 
17.8%
c 2
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 916
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
p 751
82.0%
d 163
 
17.8%
c 2
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 916
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
p 751
82.0%
d 163
 
17.8%
c 2
 
0.2%

fuel_capacity
Real number (ℝ)

Distinct: 21
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 41.966083
Minimum: 15
Maximum: 70
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.2 KiB

Quantile statistics

Minimum: 15
5-th percentile: 32
Q1: 35
median: 42
Q3: 45
95-th percentile: 60
Maximum: 70
Range: 55
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 7.8967841
Coefficient of variation (CV): 0.18817062
Kurtosis: 0.9447582
Mean: 41.966083
Median Absolute Deviation (MAD): 5
Skewness: 0.72399891
Sum: 38357
Variance: 62.359199
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=21)
ValueCountFrequency (%)
35 160
17.5%
45 154
16.8%
43 134
14.7%
40 96
10.5%
37 89
9.7%
60 62
 
6.8%
42 44
 
4.8%
50 35
 
3.8%
28 29
 
3.2%
52 27
 
3.0%
Other values (11) 84
9.2%
ValueCountFrequency (%)
15 1
 
0.1%
24 1
 
0.1%
27 14
 
1.5%
28 29
 
3.2%
32 27
 
3.0%
35 160
17.5%
36 4
 
0.4%
37 89
9.7%
40 96
10.5%
41 2
 
0.2%
ValueCountFrequency (%)
70 5
 
0.5%
66 1
 
0.1%
60 62
6.8%
55 13
 
1.4%
52 27
 
3.0%
50 35
 
3.8%
48 14
 
1.5%
45 154
16.8%
44 2
 
0.2%
43 134
14.7%

engine_type
Categorical

HIGH CARDINALITY  HIGH CORRELATION 

Distinct: 91
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Memory size: 14.3 KiB
(empty string) 119
kappa vtvt 77
k10b 60
i-vtec 33
vvt 33
Other values (86) 592

Length

Max length: 95
Median length: 24
Mean length: 8.6148796
Min length: 0

Characters and Unicode

Total characters: 7874
Distinct characters: 39
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 25
Unique (%): 2.7%

Sample

1st row: tdi
2nd row: kappa
3rd row: i-vtec
4th row: (empty string)
5th row: kappa vtvt

Common Values

ValueCountFrequency (%)
(empty string) 119
 
13.0%
kappa vtvt 77
 
8.4%
k10b 60
 
6.6%
i-vtec 33
 
3.6%
vvt 33
 
3.6%
kappa 32
 
3.5%
f8d 25
 
2.7%
k series 24
 
2.6%
fwd 23
 
2.5%
mpi 23
 
2.5%
Other values (81) 465
50.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
kappa 159
 
9.9%
vtvt 149
 
9.2%
4 79
 
4.9%
cylinder 78
 
4.8%
dual 64
 
4.0%
k10b 60
 
3.7%
vvt 57
 
3.5%
i-vtec 44
 
2.7%
series 37
 
2.3%
k 35
 
2.2%
Other values (98) 850
52.7%

Most occurring characters

ValueCountFrequency (%)
819
 
10.4%
t 702
 
8.9%
v 618
 
7.8%
i 560
 
7.1%
a 527
 
6.7%
e 448
 
5.7%
d 434
 
5.5%
p 402
 
5.1%
k 355
 
4.5%
r 339
 
4.3%
Other values (29) 2670
33.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6427
81.6%
Space Separator 819
 
10.4%
Decimal Number 498
 
6.3%
Dash Punctuation 121
 
1.5%
Other Punctuation 9
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 702
10.9%
v 618
 
9.6%
i 560
 
8.7%
a 527
 
8.2%
e 448
 
7.0%
d 434
 
6.8%
p 402
 
6.3%
k 355
 
5.5%
r 339
 
5.3%
c 336
 
5.2%
Other values (16) 1706
26.5%
Decimal Number
ValueCountFrequency (%)
1 133
26.7%
4 121
24.3%
0 78
15.7%
2 63
12.7%
6 34
 
6.8%
8 25
 
5.0%
5 25
 
5.0%
3 11
 
2.2%
9 8
 
1.6%
Other Punctuation
ValueCountFrequency (%)
. 8
88.9%
/ 1
 
11.1%
Space Separator
ValueCountFrequency (%)
819
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 121
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6427
81.6%
Common 1447
 
18.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 702
10.9%
v 618
 
9.6%
i 560
 
8.7%
a 527
 
8.2%
e 448
 
7.0%
d 434
 
6.8%
p 402
 
6.3%
k 355
 
5.5%
r 339
 
5.3%
c 336
 
5.2%
Other values (16) 1706
26.5%
Common
ValueCountFrequency (%)
819
56.6%
1 133
 
9.2%
- 121
 
8.4%
4 121
 
8.4%
0 78
 
5.4%
2 63
 
4.4%
6 34
 
2.3%
8 25
 
1.7%
5 25
 
1.7%
3 11
 
0.8%
Other values (3) 17
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7874
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
819
 
10.4%
t 702
 
8.9%
v 618
 
7.8%
i 560
 
7.1%
a 527
 
6.7%
e 448
 
5.7%
d 434
 
5.5%
p 402
 
5.1%
k 355
 
4.5%
r 339
 
4.3%
Other values (29) 2670
33.9%

cc_displacement
Real number (ℝ)

Distinct: 38
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1283.3742
Minimum: 624
Maximum: 2179
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.2 KiB

Quantile statistics

Minimum: 624
5-th percentile: 814
Q1: 1197
median: 1197
Q3: 1496
95-th percentile: 1798
Maximum: 2179
Range: 1555
Interquartile range (IQR): 299

Descriptive statistics

Standard deviation: 272.76855
Coefficient of variation (CV): 0.21254016
Kurtosis: 1.0345116
Mean: 1283.3742
Median Absolute Deviation (MAD): 198
Skewness: 0.71994074
Sum: 1173004
Variance: 74402.683
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=38)
ValueCountFrequency (%)
1197 260
28.4%
998 97
 
10.6%
1199 44
 
4.8%
1591 34
 
3.7%
1248 33
 
3.6%
1596 31
 
3.4%
1497 30
 
3.3%
999 30
 
3.3%
1498 29
 
3.2%
796 28
 
3.1%
Other values (28) 298
32.6%
ValueCountFrequency (%)
624 2
 
0.2%
796 28
 
3.1%
799 11
 
1.2%
814 15
 
1.6%
936 2
 
0.2%
998 97
10.6%
999 30
 
3.3%
1086 14
 
1.5%
1193 1
 
0.1%
1194 2
 
0.2%
ValueCountFrequency (%)
2179 15
1.6%
1991 2
 
0.2%
1968 1
 
0.1%
1956 17
1.9%
1798 13
 
1.4%
1696 10
 
1.1%
1599 1
 
0.1%
1598 5
 
0.5%
1596 31
3.4%
1591 34
3.7%

transmission_gears
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 14.3 KiB
5 707
6 153
4 35
7 16
c 3

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 914
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 7
2nd row: 5
3rd row: 5
4th row: 5
5th row: 5

Common Values

ValueCountFrequency (%)
5 707
77.4%
6 153
 
16.7%
4 35
 
3.8%
7 16
 
1.8%
c 3
 
0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
5 707
77.4%
6 153
 
16.7%
4 35
 
3.8%
7 16
 
1.8%
c 3
 
0.3%

Most occurring characters

ValueCountFrequency (%)
5 707
77.4%
6 153
 
16.7%
4 35
 
3.8%
7 16
 
1.8%
c 3
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 911
99.7%
Lowercase Letter 3
 
0.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 707
77.6%
6 153
 
16.8%
4 35
 
3.8%
7 16
 
1.8%
Lowercase Letter
ValueCountFrequency (%)
c 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 911
99.7%
Latin 3
 
0.3%

Most frequent character per script

Common
ValueCountFrequency (%)
5 707
77.6%
6 153
 
16.8%
4 35
 
3.8%
7 16
 
1.8%
Latin
ValueCountFrequency (%)
c 3
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 914
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 707
77.4%
6 153
 
16.7%
4 35
 
3.8%
7 16
 
1.8%
c 3
 
0.3%

transmission_type
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 14.3 KiB
m 653
a 261

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 914
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: a
2nd row: m
3rd row: m
4th row: m
5th row: m

Common Values

ValueCountFrequency (%)
m 653
71.4%
a 261
 
28.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
m 653
71.4%
a 261
 
28.6%

Most occurring characters

ValueCountFrequency (%)
m 653
71.4%
a 261
 
28.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 914
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
m 653
71.4%
a 261
 
28.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 914
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
m 653
71.4%
a 261
 
28.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 914
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
m 653
71.4%
a 261
 
28.6%

bhp
Real number (ℝ)

Distinct: 105
Distinct (%): 11.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 90.030175
Minimum: 34
Maximum: 177
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.3 KiB

Quantile statistics

Minimum: 34
5-th percentile: 55.2
Q1: 74
Median: 82
Q3: 103.25
95-th percentile: 138
Maximum: 177
Range: 143
Interquartile range (IQR): 29.25

Descriptive statistics

Standard deviation: 24.595367
Coefficient of variation (CV): 0.27319026
Kurtosis: 0.98619333
Mean: 90.030175
Median Absolute Deviation (MAD): 13.93
Skewness: 0.87533373
Sum: 82287.58
Variance: 604.93205
Monotonicity: Not monotonic
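
Several of the bhp figures are derived from one another through the usual textbook definitions. The sketch below recomputes them from the reported values as a consistency check; it assumes those standard definitions and is not taken from the profiler's source.

```python
# Figures copied from the bhp summary in this report.
mean, std = 90.030175, 24.595367
q1, q3 = 74.0, 103.25
minimum, maximum = 34.0, 177.0
n_rows, total = 914, 82287.58

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
variance = std ** 2              # variance is the squared standard deviation
mean_from_sum = total / n_rows   # mean recovered from the sum and row count

print(f"CV={cv:.8f} IQR={iqr} range={value_range}")
print(f"variance={variance:.3f} mean={mean_from_sum:.6f}")
```

The recomputed values agree with the reported CV, IQR, range, variance and mean up to rounding of the inputs.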
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
82 | 51 | 5.6%
81.8 | 49 | 5.4%
67 | 48 | 5.3%
81.86 | 40 | 4.4%
88.5 | 30 | 3.3%
81.83 | 26 | 2.8%
89 | 26 | 2.8%
78.9 | 25 | 2.7%
67.04 | 25 | 2.7%
85 | 24 | 2.6%
Other values (95) | 570 | 62.4%

Minimum 10 values
Value | Count | Frequency (%)
34 | 4 | 0.4%
35 | 1 | 0.1%
37 | 1 | 0.1%
46.3 | 1 | 0.1%
47.3 | 20 | 2.2%
47.33 | 3 | 0.3%
48 | 2 | 0.2%
53 | 3 | 0.3%
53.3 | 7 | 0.8%
53.64 | 1 | 0.1%

Maximum 10 values
Value | Count | Frequency (%)
177 | 1 | 0.1%
170.63 | 1 | 0.1%
170 | 3 | 0.3%
167.68 | 12 | 1.3%
167.67 | 1 | 0.1%
164 | 2 | 0.2%
158 | 1 | 0.1%
141 | 11 | 1.2%
140 | 6 | 0.7%
138.13 | 1 | 0.1%

torque
Real number (ℝ)

Distinct: 79
Distinct (%): 8.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 140.75864
Minimum: 48
Maximum: 380
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.3 KiB

Quantile statistics

Minimum: 48
5-th percentile: 74.5
Q1: 109
Median: 114
Q3: 154
95-th percentile: 265
Maximum: 380
Range: 332
Interquartile range (IQR): 45

Descriptive statistics

Standard deviation: 62.7486
Coefficient of variation (CV): 0.44578861
Kurtosis: 2.5126335
Mean: 140.75864
Median Absolute Deviation (MAD): 24
Skewness: 1.671263
Sum: 128653.4
Variance: 3937.3868
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
113 | 104 | 11.4%
90 | 81 | 8.9%
110 | 40 | 4.4%
114 | 39 | 4.3%
113.75 | 34 | 3.7%
115 | 32 | 3.5%
200 | 30 | 3.3%
145 | 29 | 3.2%
114.7 | 26 | 2.8%
69 | 25 | 2.7%
Other values (69) | 474 | 51.9%

Minimum 10 values
Value | Count | Frequency (%)
48 | 1 | 0.1%
51 | 1 | 0.1%
59 | 4 | 0.4%
62 | 1 | 0.1%
69 | 25 | 2.7%
72 | 11 | 1.2%
74.5 | 12 | 1.3%
78 | 1 | 0.1%
90 | 81 | 8.9%
91 | 18 | 2.0%

Maximum 10 values
Value | Count | Frequency (%)
380 | 2 | 0.2%
350 | 23 | 2.5%
330 | 5 | 0.5%
320 | 3 | 0.3%
300 | 2 | 0.2%
280 | 9 | 1.0%
265 | 9 | 1.0%
260 | 9 | 1.0%
250 | 17 | 1.9%
248 | 3 | 0.3%

fuel_economy
Categorical

Distinct: 158
Distinct (%): 17.3%
Missing: 0
Missing (%): 0.0%
Memory size: 14.3 KiB

Value | Count
18.6 | 59
18.9 | 51
21.01 | 39
18.5 | 37
23.1 | 23
Other values (153) | 705

Length

Max length: 18
Median length: 5
Mean length: 4.3719912
Min length: 2

Characters and Unicode

Total characters: 3996
Distinct characters: 21
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 47
Unique (%): 5.1%

Sample

1st row: 21.66
2nd row: 17.19
3rd row: 16.5
4th row: 21.7
5th row: 18.9

Common Values

Value | Count | Frequency (%)
18.6 | 59 | 6.5%
18.9 | 51 | 5.6%
21.01 | 39 | 4.3%
18.5 | 37 | 4.0%
23.1 | 23 | 2.5%
20.36 | 22 | 2.4%
17.4 | 21 | 2.3%
22.05 | 15 | 1.6%
17.01 | 15 | 1.6%
19.1 | 15 | 1.6%
Other values (148) | 617 | 67.5%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
18.6 | 59 | 6.4%
18.9 | 51 | 5.5%
21.01 | 39 | 4.2%
18.5 | 37 | 4.0%
23.1 | 23 | 2.5%
20.36 | 22 | 2.4%
17.4 | 21 | 2.3%
22.05 | 15 | 1.6%
17.01 | 15 | 1.6%
19.1 | 15 | 1.6%
Other values (153) | 623 | 67.7%

Most occurring characters

Value | Count | Frequency (%)
1 | 881 | 22.0%
. | 861 | 21.5%
2 | 545 | 13.6%
8 | 268 | 6.7%
0 | 261 | 6.5%
7 | 248 | 6.2%
6 | 219 | 5.5%
9 | 201 | 5.0%
5 | 177 | 4.4%
4 | 172 | 4.3%
Other values (11) | 163 | 4.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 3114 | 77.9%
Other Punctuation | 862 | 21.6%
Lowercase Letter | 14 | 0.4%
Space Separator | 6 | 0.2%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 881 | 28.3%
2 | 545 | 17.5%
8 | 268 | 8.6%
0 | 261 | 8.4%
7 | 248 | 8.0%
6 | 219 | 7.0%
9 | 201 | 6.5%
5 | 177 | 5.7%
4 | 172 | 5.5%
3 | 142 | 4.6%

Lowercase Letter
Value | Count | Frequency (%)
b | 3 | 21.4%
s | 2 | 14.3%
i | 2 | 14.3%
v | 2 | 14.3%
p | 2 | 14.3%
h | 1 | 7.1%
r | 1 | 7.1%
m | 1 | 7.1%

Other Punctuation
Value | Count | Frequency (%)
. | 861 | 99.9%
@ | 1 | 0.1%

Space Separator
Value | Count | Frequency (%)
(space) | 6 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 3982 | 99.6%
Latin | 14 | 0.4%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 881 | 22.1%
. | 861 | 21.6%
2 | 545 | 13.7%
8 | 268 | 6.7%
0 | 261 | 6.6%
7 | 248 | 6.2%
6 | 219 | 5.5%
9 | 201 | 5.0%
5 | 177 | 4.4%
4 | 172 | 4.3%
Other values (3) | 149 | 3.7%

Latin
Value | Count | Frequency (%)
b | 3 | 21.4%
s | 2 | 14.3%
i | 2 | 14.3%
v | 2 | 14.3%
p | 2 | 14.3%
h | 1 | 7.1%
r | 1 | 7.1%
m | 1 | 7.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3996 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 881 | 22.0%
. | 861 | 21.5%
2 | 545 | 13.6%
8 | 268 | 6.7%
0 | 261 | 6.5%
7 | 248 | 6.2%
6 | 219 | 5.5%
9 | 201 | 5.0%
5 | 177 | 4.4%
4 | 172 | 4.3%
Other values (11) | 163 | 4.1%

emission_class
Categorical

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 14.3 KiB

Value | Count
4 | 552
5 | 281
3 | 81

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 914
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: 5
3rd row: 4
4th row: 4
5th row: 5

Common Values

Value | Count | Frequency (%)
4 | 552 | 60.4%
5 | 281 | 30.7%
3 | 81 | 8.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
4 | 552 | 60.4%
5 | 281 | 30.7%
3 | 81 | 8.9%

Most occurring characters

Value | Count | Frequency (%)
4 | 552 | 60.4%
5 | 281 | 30.7%
3 | 81 | 8.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 914 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
4 | 552 | 60.4%
5 | 281 | 30.7%
3 | 81 | 8.9%

Most occurring scripts

Value | Count | Frequency (%)
Common | 914 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
4 | 552 | 60.4%
5 | 281 | 30.7%
3 | 81 | 8.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 914 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
4 | 552 | 60.4%
5 | 281 | 30.7%
3 | 81 | 8.9%

price
Real number (ℝ)

Distinct: 585
Distinct (%): 64.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 721344.64
Minimum: 188000
Maximum: 2941000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.2 KiB

Quantile statistics

Minimum: 188000
5-th percentile: 329300
Q1: 466750
Median: 652000
Q3: 861500
95-th percentile: 1450100
Maximum: 2941000
Range: 2753000
Interquartile range (IQR): 394750

Descriptive statistics

Standard deviation: 348866.82
Coefficient of variation (CV): 0.48363404
Kurtosis: 3.8759316
Mean: 721344.64
Median Absolute Deviation (MAD): 194500
Skewness: 1.632434
Sum: 6.59309 × 10^8
Variance: 1.2170806 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
790000 | 6 | 0.7%
699000 | 6 | 0.7%
914000 | 5 | 0.5%
625000 | 5 | 0.5%
544000 | 5 | 0.5%
396000 | 5 | 0.5%
589000 | 5 | 0.5%
651000 | 4 | 0.4%
694000 | 4 | 0.4%
628000 | 4 | 0.4%
Other values (575) | 865 | 94.6%

Minimum 10 values
Value | Count | Frequency (%)
188000 | 1 | 0.1%
237000 | 1 | 0.1%
239000 | 1 | 0.1%
245000 | 1 | 0.1%
248000 | 1 | 0.1%
255000 | 1 | 0.1%
262000 | 1 | 0.1%
266000 | 1 | 0.1%
267000 | 1 | 0.1%
269000 | 1 | 0.1%

Maximum 10 values
Value | Count | Frequency (%)
2941000 | 1 | 0.1%
2100000 | 1 | 0.1%
2019000 | 2 | 0.2%
1984000 | 2 | 0.2%
1978000 | 2 | 0.2%
1972000 | 1 | 0.1%
1958000 | 2 | 0.2%
1912000 | 1 | 0.1%
1875000 | 1 | 0.1%
1848000 | 2 | 0.2%

engine_litres
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 10
Distinct (%): 1.9%
Missing: 393
Missing (%): 43.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.3036468
Minimum: 0.8
Maximum: 2
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.3 KiB

Quantile statistics

Minimum: 0.8
5-th percentile: 1
Q1: 1.2
Median: 1.2
Q3: 1.5
95-th percentile: 1.6
Maximum: 2
Range: 1.2
Interquartile range (IQR): 0.3

Descriptive statistics

Standard deviation: 0.2447254
Coefficient of variation (CV): 0.18772369
Kurtosis: 0.7868113
Mean: 1.3036468
Median Absolute Deviation (MAD): 0.2
Skewness: 0.92571506
Sum: 679.2
Variance: 0.059890521
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)

Common values
Value | Count | Frequency (%)
1.2 | 242 | 26.5%
1.5 | 87 | 9.5%
1 | 79 | 8.6%
1.6 | 58 | 6.3%
2 | 23 | 2.5%
1.4 | 16 | 1.8%
1.1 | 6 | 0.7%
1.3 | 5 | 0.5%
0.8 | 4 | 0.4%
1.8 | 1 | 0.1%
(Missing) | 393 | 43.0%

Minimum 10 values
Value | Count | Frequency (%)
0.8 | 4 | 0.4%
1 | 79 | 8.6%
1.1 | 6 | 0.7%
1.2 | 242 | 26.5%
1.3 | 5 | 0.5%
1.4 | 16 | 1.8%
1.5 | 87 | 9.5%
1.6 | 58 | 6.3%
1.8 | 1 | 0.1%
2 | 23 | 2.5%

Maximum 10 values
Value | Count | Frequency (%)
2 | 23 | 2.5%
1.8 | 1 | 0.1%
1.6 | 58 | 6.3%
1.5 | 87 | 9.5%
1.4 | 16 | 1.8%
1.3 | 5 | 0.5%
1.2 | 242 | 26.5%
1.1 | 6 | 0.7%
1 | 79 | 8.6%
0.8 | 4 | 0.4%

drive_train
Categorical

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 3.8%
Missing: 888
Missing (%): 97.2%
Memory size: 14.3 KiB

Value | Count
fwd | 26

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 78
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: fwd
2nd row: fwd
3rd row: fwd
4th row: fwd
5th row: fwd

Common Values

Value | Count | Frequency (%)
fwd | 26 | 2.8%
(Missing) | 888 | 97.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
fwd | 26 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
f | 26 | 33.3%
w | 26 | 33.3%
d | 26 | 33.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 78 | 100.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
f | 26 | 33.3%
w | 26 | 33.3%
d | 26 | 33.3%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 78 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
f | 26 | 33.3%
w | 26 | 33.3%
d | 26 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 78 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
f | 26 | 33.3%
w | 26 | 33.3%
d | 26 | 33.3%

Interactions

Interaction plots (64 matplotlib SVG panels) are not reproduced here.

Correlations

Variable | year | mileage | fuel_capacity | cc_displacement | bhp | torque | price | engine_litres | make | model | color | body_style | num_owners | seating_capacity | fuel_type | engine_type | transmission_gears | transmission_type | emission_class
year | 1.000 | -0.644 | -0.058 | 0.060 | 0.139 | 0.093 | 0.569 | -0.090 | 0.209 | 0.409 | 0.098 | 0.176 | 0.184 | 0.123 | 0.131 | 0.385 | 0.119 | 0.210 | 0.455
mileage | -0.644 | 1.000 | 0.259 | 0.167 | 0.114 | 0.205 | -0.206 | 0.247 | 0.121 | 0.220 | 0.000 | 0.000 | 0.157 | 0.044 | 0.216 | 0.236 | 0.000 | 0.070 | 0.187
fuel_capacity | -0.058 | 0.259 | 1.000 | 0.609 | 0.730 | 0.783 | 0.585 | 0.608 | 0.509 | 0.923 | 0.212 | 0.459 | 0.000 | 0.538 | 0.344 | 0.766 | 0.309 | 0.112 | 0.234
cc_displacement | 0.060 | 0.167 | 0.609 | 1.000 | 0.758 | 0.734 | 0.620 | 0.849 | 0.474 | 0.805 | 0.136 | 0.371 | 0.059 | 0.476 | 0.450 | 0.775 | 0.302 | 0.207 | 0.154
bhp | 0.139 | 0.114 | 0.730 | 0.758 | 1.000 | 0.872 | 0.778 | 0.761 | 0.471 | 0.800 | 0.127 | 0.400 | 0.014 | 0.174 | 0.290 | 0.745 | 0.381 | 0.214 | 0.163
torque | 0.093 | 0.205 | 0.783 | 0.734 | 0.872 | 1.000 | 0.739 | 0.738 | 0.419 | 0.789 | 0.119 | 0.410 | 0.079 | 0.220 | 0.646 | 0.777 | 0.359 | 0.244 | 0.146
price | 0.569 | -0.206 | 0.585 | 0.620 | 0.778 | 0.739 | 1.000 | 0.515 | 0.445 | 0.614 | 0.126 | 0.376 | 0.122 | 0.118 | 0.268 | 0.614 | 0.342 | 0.287 | 0.228
engine_litres | -0.090 | 0.247 | 0.608 | 0.849 | 0.761 | 0.738 | 0.515 | 1.000 | 0.509 | 0.810 | 0.128 | 0.410 | 0.102 | 0.180 | 0.733 | 0.717 | 0.368 | 0.201 | 0.266
make | 0.209 | 0.121 | 0.509 | 0.474 | 0.471 | 0.419 | 0.445 | 0.509 | 1.000 | 0.960 | 0.152 | 0.381 | 0.082 | 0.328 | 0.294 | 0.882 | 0.315 | 0.155 | 0.252
model | 0.409 | 0.220 | 0.923 | 0.805 | 0.800 | 0.789 | 0.614 | 0.810 | 0.960 | 1.000 | 0.304 | 0.954 | 0.084 | 0.934 | 0.486 | 0.746 | 0.496 | 0.544 | 0.720
color | 0.098 | 0.000 | 0.212 | 0.136 | 0.127 | 0.119 | 0.126 | 0.128 | 0.152 | 0.304 | 1.000 | 0.185 | 0.000 | 0.053 | 0.000 | 0.289 | 0.061 | 0.102 | 0.179
body_style | 0.176 | 0.000 | 0.459 | 0.371 | 0.400 | 0.410 | 0.376 | 0.410 | 0.381 | 0.954 | 0.185 | 1.000 | 0.026 | 0.340 | 0.293 | 0.698 | 0.277 | 0.229 | 0.118
num_owners | 0.184 | 0.157 | 0.000 | 0.059 | 0.014 | 0.079 | 0.122 | 0.102 | 0.082 | 0.084 | 0.000 | 0.026 | 1.000 | 0.000 | 0.000 | 0.000 | 0.085 | 0.079 | 0.080
seating_capacity | 0.123 | 0.044 | 0.538 | 0.476 | 0.174 | 0.220 | 0.118 | 0.180 | 0.328 | 0.934 | 0.053 | 0.340 | 0.000 | 1.000 | 0.140 | 0.871 | 0.130 | 0.070 | 0.098
fuel_type | 0.131 | 0.216 | 0.344 | 0.450 | 0.290 | 0.646 | 0.268 | 0.733 | 0.294 | 0.486 | 0.000 | 0.293 | 0.000 | 0.140 | 1.000 | 0.581 | 0.252 | 0.000 | 0.000
engine_type | 0.385 | 0.236 | 0.766 | 0.775 | 0.745 | 0.777 | 0.614 | 0.717 | 0.882 | 0.746 | 0.289 | 0.698 | 0.000 | 0.871 | 0.581 | 1.000 | 0.549 | 0.401 | 0.609
transmission_gears | 0.119 | 0.000 | 0.309 | 0.302 | 0.381 | 0.359 | 0.342 | 0.368 | 0.315 | 0.496 | 0.061 | 0.277 | 0.085 | 0.130 | 0.252 | 0.549 | 1.000 | 0.291 | 0.060
transmission_type | 0.210 | 0.070 | 0.112 | 0.207 | 0.214 | 0.244 | 0.287 | 0.201 | 0.155 | 0.544 | 0.102 | 0.229 | 0.079 | 0.070 | 0.000 | 0.401 | 0.291 | 1.000 | 0.182
emission_class | 0.455 | 0.187 | 0.234 | 0.154 | 0.163 | 0.146 | 0.228 | 0.266 | 0.252 | 0.720 | 0.179 | 0.118 | 0.080 | 0.098 | 0.000 | 0.609 | 0.060 | 0.182 | 1.000
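
To read a matrix like this one programmatically, one option is to filter variable pairs by the absolute size of their coefficient. The sketch below copies a handful of entries from the matrix and keeps those at or above 0.5; that cut-off and the helper name `strong_pairs` are assumptions chosen for illustration, not settings confirmed by the report.

```python
# A few entries copied from the correlation matrix in this report.
corr = {
    ("year", "mileage"): -0.644,
    ("year", "price"): 0.569,
    ("bhp", "torque"): 0.872,
    ("bhp", "price"): 0.778,
    ("cc_displacement", "engine_litres"): 0.849,
    ("color", "price"): 0.126,
    ("num_owners", "price"): 0.122,
}

THRESHOLD = 0.5  # assumed cut-off, for illustration only


def strong_pairs(matrix, threshold=THRESHOLD):
    """Return (pair, coefficient) tuples with |coefficient| >= threshold, strongest first."""
    hits = [(pair, r) for pair, r in matrix.items() if abs(r) >= threshold]
    return sorted(hits, key=lambda item: abs(item[1]), reverse=True)


pairs = strong_pairs(corr)
for (a, b), r in pairs:
    print(f"{a} ~ {b}: {r:+.3f}")
```

Using the absolute value keeps strongly negative relationships such as year versus mileage in the result alongside the positive ones.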

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
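
One way to compute a nullity correlation is to turn each column into a present/missing indicator and take the Pearson correlation of the two indicators. The sketch below does this with the standard library only. The toy columns and the helper names `pearson` and `nullity_correlation` are assumptions for illustration, and the values are invented rather than taken from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when a column is fully present or fully missing
    return cov / math.sqrt(var_x * var_y)


def nullity_correlation(col_a, col_b):
    """Correlate the presence (1) or absence (0) of values in two columns."""
    present_a = [0 if v is None else 1 for v in col_a]
    present_b = [0 if v is None else 1 for v in col_b]
    return pearson(present_a, present_b)


# Toy columns: None marks a missing cell.
engine_litres = [1.5, 1.2, None, 1.0, 1.2, None, 1.6, 1.6]
drive_train = [None, None, None, None, None, None, "fwd", "fwd"]

r = nullity_correlation(engine_litres, drive_train)
print(round(r, 3))
```

A value near 1 means the two columns tend to be present or missing together, a value near -1 means one tends to be present when the other is missing, and a value near 0 means the missingness patterns are unrelated.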

Sample

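First rows
index | name | make | model | year | color | body_style | mileage | num_owners | seating_capacity | fuel_type | fuel_capacity | engine_type | cc_displacement | transmission_gears | transmission_type | bhp | torque | fuel_economy | emission_class | price | engine_litres | drive_train
0 | highline at (d) | volkswagen | ameo | 2017 | silver | sedan | 44611 | 1 | 5 | d | 45 | tdi | 1498 | 7 | a | 109.00 | 250.00 | 21.66 | 4 | 657000 | 1.5 | <NA>
1 | sx | hyundai | i20 active | 2016 | red | crossover | 20305 | 1 | 5 | p | 45 | kappa | 1197 | 5 | m | 82.00 | 115.00 | 17.19 | 5 | 682000 | 1.2 | <NA>
2 | vx | honda | wr-v | 2019 | white | suv | 29540 | 2 | 5 | p | 40 | i-vtec | 1199 | 5 | m | 88.50 | 110.00 | 16.5 | 4 | 793000 | NaN | <NA>
3 | rxt amt | renault | kwid | 2017 | bronze | hatchback | 35680 | 1 | 5 | p | 28 |  | 999 | 5 | m | 67.00 | 91.00 | 21.7 | 4 | 414000 | 1.0 | <NA>
4 | asta | hyundai | grand i10 | 2017 | orange | hatchback | 25126 | 1 | 5 | p | 43 | kappa vtvt | 1197 | 5 | m | 81.86 | 113.75 | 18.9 | 5 | 515000 | 1.2 | <NA>
5 | sportz | hyundai | elite i20 | 2016 | red | hatchback | 52261 | 1 | 5 | p | 45 | kappa vtvt | 1197 | 5 | m | 81.83 | 114.70 | 18.6 | 4 | 604000 | 1.2 | <NA>
6 | v mt | honda | brio | 2012 | grey | hatchback | 28108 | 2 | 5 | p | 35 | 4 cylinder inline | 1198 | 5 | m | 86.80 | 109.00 | 19.4 | 3 | 316000 | NaN | <NA>
7 | xz | tata | harrier | 2019 | grey | suv | 92603 | 1 | 5 | d | 50 | kryotec turbocharge | 1956 | 6 | a | 138.00 | 350.00 | 17.0 | 4 | 1419000 | 2.0 | <NA>
8 | sportz amt vtvt | hyundai | grand i10 nios | 2021 | blue | hatchback | 16304 | 1 | 5 | p | 37 | kappa | 1197 | 5 | m | 81.86 | 113.75 | 20.07 | 4 | 710000 | 1.2 | <NA>
9 | rxt opt | renault | kwid | 2019 | bronze | hatchback | 26350 | 2 | 5 | p | 28 |  | 999 | 5 | m | 67.00 | 91.00 | 22.0 | 4 | 392000 | 1.0 | <NA>

Last rows
index | name | make | model | year | color | body_style | mileage | num_owners | seating_capacity | fuel_type | fuel_capacity | engine_type | cc_displacement | transmission_gears | transmission_type | bhp | torque | fuel_economy | emission_class | price | engine_litres | drive_train
966 | lxi | maruti suzuki | swift | 2021 | blue | hatchback | 9384 | 1 | 5 | p | 42 | k series | 1197 | 5 | a | 85.80 | 114.0 | 18.6 | 4 | 590000 | NaN | <NA>
967 | zxi plus | maruti suzuki | swift | 2020 | grey | hatchback | 6104 | 1 | 5 | p | 37 | vtvt | 1197 | 5 | a | 81.80 | 113.0 | 21.21 | 4 | 838000 | NaN | <NA>
968 | zxi amt | maruti suzuki | swift | 2019 | grey | hatchback | 21992 | 2 | 5 | p | 37 | vtvt | 1197 | 5 | a | 81.80 | 113.0 | 21.21 | 4 | 829000 | NaN | <NA>
969 | rxt dual tone | renault | captur | 2018 | orange | suv | 54137 | 1 | 5 | p | 50 | h4k | 1498 | 5 | a | 105.00 | 142.0 | 13.87 | 5 | 799000 | 1.5 | <NA>
970 | zxi amt | maruti suzuki | dzire | 2017 | blue | sedan | 33258 | 1 | 5 | p | 37 |  | 1197 | 5 | a | 88.50 | 113.0 | 23.26 | 5 | 722000 | 1.2 | <NA>
971 | vtvt sx | hyundai | verna | 2018 | white | sedan | 23869 | 1 | 5 | p | 43 |  | 1591 | 6 | a | 121.00 | 158.0 | 17.4 | 5 | 956000 | 1.6 | fwd
972 | vtvt sx | hyundai | verna | 2019 | white | sedan | 14831 | 1 | 5 | p | 43 |  | 1591 | 6 | a | 121.00 | 158.0 | 17.4 | 5 | 1027000 | 1.6 | fwd
973 | fluidic vtvt sx at | hyundai | verna | 2014 | silver | sedan | 52846 | 1 | 5 | p | 43 |  | 1591 | 4 | a | 121.00 | 158.0 | 17.01 | 3 | 736000 | 1.6 | fwd
974 | flair edition | ford | freestyle | 2020 | white | crossover | 28335 | 2 | 5 | d | 42 |  | 1499 | 5 | a | 98.96 | 215.0 | 18.5 | 5 | 749000 | 1.5 | <NA>
975 | alpha amt | maruti suzuki | ignis | 2018 | silver | hatchback | 41176 | 1 | 5 | p | 32 | vvt | 1197 | 5 | a | 81.80 | 113.0 | 20.89 | 5 | 691000 | 1.2 | <NA>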